Programming exercises: Recommend using MATLAB
Coding.1 Implement a script that performs the Arnoldi decomposition
$$AV_{k}=V_{k}H_{k}+f_{k}e_{k}^{T}=V_{k+1}\bar{H}_{k}, \quad f_{k}=v_{k+1}\Vert f_{k} \Vert$$
The inputs are $A$, $v_{1}$, $k$; the outputs can be $V_{k}$, $H_{k}$, $f_{k}$, or alternatively $V_{k+1}$, $\bar{H}_{k}$.
     Solution:
     Krylov subspace methods are among the most competitive iterative schemes in present-day numerical linear algebra; indeed, most pertinent large-scale linear systems of the present day, from physics simulation in video games to deep learning problems, are solved using preconditioned Krylov subspace solvers; Gutknecht (2006)[1], Heyn et al. (2012)[2]. The application of these solvers is profound, and their importance in this world of data and computation grows with each day. Nevertheless, the objective of this project is to construct these algorithms for the purposes of academic exploration and development on my own part.
     To begin, a Krylov subspace is the space generated (spanned) by the images of a vector $u$ under successive powers of a matrix $A$. This process generates a sequence of vectors,
$$\{u,Au,A^{2}u,\cdots,A^{m-1}u\}$$
these vectors, when linearly independent, form the Krylov subspace, an $m$-dimensional linear subspace; Wikipedia: Krylov subspace[3]. The dimension of this subspace need not equal the dimension of the matrix $A$ that generated it. Instead, the dimension of the Krylov subspace is chosen by the practitioner according to their computational aims; since solving the entire system may be computationally expensive or even intractable, one may opt to generate a smaller Krylov subspace to represent the large-scale linear system defined by $A$.
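To make this concrete, the raw basis above can be formed explicitly; the demonstration below (illustrative only, with sizes and the seed chosen arbitrarily) also shows why one never works with it directly: the powers $A^{j}u$ align with the dominant eigendirection, so the raw Krylov basis becomes severely ill-conditioned after only a few steps, which is what motivates orthonormalizing it with the Arnoldi process.

```python
import numpy as np

# Build the raw Krylov basis {u, A u, ..., A^{m-1} u} column by column.
rng = np.random.default_rng(0)
n, m = 50, 8
A = rng.standard_normal((n, n))
u = rng.standard_normal(n)

K = np.zeros((n, m))
K[:, 0] = u
for j in range(1, m):
    K[:, j] = A @ K[:, j - 1]   # next Krylov vector A^j u

# The column norms grow geometrically and the columns align,
# so the condition number of K blows up rapidly with m.
print(np.linalg.cond(K))
```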
     A significant note is that the maximum dimension a Krylov subspace can attain equals the degree of the minimal polynomial of $A$ (for a diagonalizable $A$, the number of distinct eigenvalues), and thus the Krylov subspace may not be able to take on the full dimension of $A$. It is not necessary, however, for the Krylov subspace to equal the full dimension of $A$ in order to solve the large-scale linear system it represents. In fact, the notion of preconditioning is to reduce computational complexity by transforming the large-scale system into a more manageable one whose spectrum is more clustered, so that the iteration converges in significantly fewer steps.
     As conveyed in Gutknecht (2006, pg. 4)[1], the idea behind Krylov subspace methods is to generate a sequence $\{x_{j}\}_{j=0}^{n}$ of approximate solutions, contained in the Krylov subspace, that solves the large-scale linear system $Ax=b$. As in other non-stationary iterative (projection) methods, such as conjugate residual, GMRES, et cetera, the objective is to minimize the error in the $A$-norm,
$$\| x^{*}-\tilde{x}\|_{A}=\min_{z\in K} \| x^{*}-z \|_{A}.$$
Since each component of the sequence $\{x_{j}\}_{j=0}^{n}$ is contained in the Krylov subspace, $x_{n} \in K_{n+1}(A,b)$, so are the corresponding residuals, $r_{n} \in K_{n+1}(A,b)$. At each iteration there exists an approximate solution $x_{i}$ and a corresponding residual $r_{i}$; if the residuals $\{r_{i}\}_{i=0}^{n}$ are linearly independent, then the iterative scheme possesses a finite termination property and will converge in at most $n$ iterations to the exact solution, with $r_{n}=0$ and $x_{n}=x^{*}$. This property can be obtained from the Cayley-Hamilton theorem,
Cayley-Hamilton Theorem (1858, proved by Frobenius in 1878). Let $p(\lambda)=\det(A-\lambda I)=\sum_{i=0}^{n} c_{i}\lambda^{i}$ be the characteristic polynomial of $A$. Then $p(A)=0$.
Enhanced Cayley-Hamilton Theorem. Let the minimal polynomial of $A$ be $m(\lambda)=\sum_{i=0}^{k} c_{i}\lambda^{i}$, where $k$ is the degree of the minimal polynomial. Then $m(A)=0$.
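The Cayley-Hamilton theorem is easy to check numerically; the sketch below (my own illustration, using a small random matrix) evaluates the characteristic polynomial at $A$ with Horner's scheme and verifies that the result is numerically zero. Note that `np.poly` returns the monic polynomial $\det(\lambda I - A)$, which differs from $\det(A-\lambda I)$ only by a sign and therefore also annihilates $A$.

```python
import numpy as np

# Numerically verify the Cayley-Hamilton theorem on a small random matrix.
rng = np.random.default_rng(1)
A = rng.standard_normal((4, 4))

c = np.poly(A)                    # characteristic polynomial coefficients, highest degree first
pA = np.zeros_like(A)
for coeff in c:                   # Horner's scheme: p(A) built one coefficient at a time
    pA = pA @ A + coeff * np.eye(4)

print(np.linalg.norm(pA))         # should be ~ machine precision
```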
     Now, to construct a Krylov subspace numerically, we import numpy as np, matplotlib.pyplot as plt, scipy as sp, and pandas as pd; these are the four Python packages used in this project. To implement an algorithm that numerically constructs a Krylov subspace, we only need numpy and matplotlib.pyplot.
# Load library package
import numpy as np
import matplotlib.pyplot as plt
from scipy import sparse as sp
from numpy import linalg as LA
import pandas as pd
from scipy.sparse import csc_matrix
from scipy.sparse.linalg import spilu
     It is not strictly necessary, but we created a Python function to generate symmetric positive-definite matrices for our numerical experiments.
def SPD(n,m):
    # Random Gaussian factor; A = G^T G is symmetric positive semi-definite
    # (positive definite with probability 1 when n >= m)
    A=np.random.normal(size=(n,m));
    A=np.transpose(A)@A;
    return A;
     To form an orthonormal basis of the Krylov subspace, we can implement a modified Gram-Schmidt process to orthonormalize the sequence of vectors.
def arnoldi(A,b,x0):
    # Arnoldi decomposition via modified Gram-Schmidt: A Q = Q H
    # (b is unused here; kept for the assignment's interface)
    m=A.shape[0]; n=A.shape[1];
    H=np.zeros((m,n))
    Q=np.zeros((m,n));
    ritz_values = []
    x0 = x0/np.linalg.norm(x0); Q[:,0] = x0;
    for k in range(n):
        u=A@Q[:,k];
        for j in range(k+1):            # orthogonalize against the previous basis vectors
            qj=Q[:,j];
            H[j,k]=u.T@qj;
            u=u-H[j,k]*qj;
        if k+1 < n:
            H[k+1,k]=np.linalg.norm(u);
            Q[:,k+1]=u/H[k+1,k];
    plt.spy(H)                          # visualize the Hessenberg structure
    ritz_values.append(np.linalg.eig(H)[0])
    return H,Q;
     To test our Arnoldi decomposition algorithm, we generated a small-scale linear system of 5 by 5.
n=5; m=5;
A=SPD(n,m);
b=np.random.randn(n);
x0=np.random.randn(m);
[H,Q]=arnoldi(A,b,x0)
df=pd.DataFrame(H)
df
|   | 0 | 1 | 2 | 3 | 4 |
|---|---|---|---|---|---|
| 0 | 10.408177 | 2.099063 | -8.215650e-15 | 2.317591e-14 | -2.428596e-12 |
| 1 | 2.099063 | 6.253088 | 2.486675e+00 | 1.598721e-14 | -1.609804e-12 |
| 2 | 0.000000 | 2.486675 | 4.518148e+00 | 1.043731e+00 | -8.710116e-13 |
| 3 | 0.000000 | 0.000000 | 1.043731e+00 | 8.722138e+00 | 7.872520e-02 |
| 4 | 0.000000 | 0.000000 | 0.000000e+00 | 7.872520e-02 | 2.040995e-02 |
df1=pd.DataFrame(Q)
df1
|   | 0 | 1 | 2 | 3 | 4 |
|---|---|---|---|---|---|
| 0 | -0.049389 | 0.073260 | -0.715274 | 0.664245 | 0.198382 |
| 1 | -0.793488 | -0.343895 | 0.217006 | 0.079476 | 0.445764 |
| 2 | 0.555361 | -0.130424 | 0.423781 | 0.325780 | 0.623571 |
| 3 | -0.122738 | 0.800447 | -0.068410 | -0.317080 | 0.488877 |
| 4 | -0.210826 | 0.467594 | 0.506973 | 0.588039 | -0.366192 |
np.linalg.norm(Q.T @ A @ Q - H)/ np.linalg.norm(A)
9.687385811771314e-15
np.linalg.norm(Q.T @ Q - np.eye(n))
2.1695177238518212e-14
     To test our Arnoldi decomposition more rigorously, we generated a larger linear system of 500 by 500.
n=500; m=500;
A=SPD(n,m);
b=np.random.randn(n);
x0=np.random.randn(m);
[H,Q]=arnoldi(A,b,x0)
df2=pd.DataFrame(H)
df2
|   | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | ... | 490 | 491 | 492 | 493 | 494 | 495 | 496 | 497 | 498 | 499 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 531.037382 | 507.137296 | -1.687539e-13 | 1.403322e-13 | 2.131628e-14 | -7.815970e-14 | 1.171285e-13 | -1.492140e-13 | 2.273737e-13 | -2.309264e-13 | ... | 2.031640e-08 | -2.771402e-08 | 4.101009e-08 | -2.004991e-08 | 1.587491e-08 | -1.878837e-08 | 9.888035e-09 | -2.243364e-08 | 3.678599e-08 | -1.018789e-07 |
| 1 | 507.137296 | 939.239806 | 4.731738e+02 | -5.968559e-13 | 8.668621e-13 | -1.308287e-12 | 1.678657e-12 | -1.968203e-12 | 2.415845e-12 | -2.950529e-12 | ... | 4.567488e-08 | -6.230293e-08 | 9.219130e-08 | -4.507069e-08 | 3.567772e-08 | -4.222091e-08 | 2.220163e-08 | -5.034966e-08 | 8.255045e-08 | -2.286064e-07 |
| 2 | 0.000000 | 473.173784 | 9.785258e+02 | 5.189420e+02 | 8.810730e-13 | -9.805490e-13 | 1.065814e-12 | -1.244116e-12 | 1.435962e-12 | -1.785239e-12 | ... | 3.085969e-08 | -4.209215e-08 | 6.228358e-08 | -3.044823e-08 | 2.409719e-08 | -2.851345e-08 | 1.498079e-08 | -3.396009e-08 | 5.567123e-08 | -1.541586e-07 |
| 3 | 0.000000 | 0.000000 | 5.189420e+02 | 1.077344e+03 | 5.458737e+02 | 3.481659e-13 | -4.334311e-13 | 4.511946e-13 | -4.654055e-13 | 5.151435e-13 | ... | 3.020580e-09 | -4.118224e-09 | 6.092756e-09 | -2.977479e-09 | 2.351729e-09 | -2.780155e-09 | 1.449913e-09 | -3.275076e-09 | 5.362429e-09 | -1.483951e-08 |
| 4 | 0.000000 | 0.000000 | 0.000000e+00 | 5.458737e+02 | 1.069900e+03 | 5.165634e+02 | 3.304024e-13 | -3.765876e-13 | 4.831691e-13 | -4.867218e-13 | ... | -8.073380e-09 | 1.101306e-08 | -1.629649e-08 | 7.967426e-09 | -6.308201e-09 | 7.465630e-09 | -3.928232e-09 | 8.911335e-09 | -1.461233e-08 | 4.046830e-08 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 495 | 0.000000 | 0.000000 | 0.000000e+00 | 0.000000e+00 | 0.000000e+00 | 0.000000e+00 | 0.000000e+00 | 0.000000e+00 | 0.000000e+00 | 0.000000e+00 | ... | 0.000000e+00 | 0.000000e+00 | 0.000000e+00 | 0.000000e+00 | 7.745447e+00 | 7.385889e+00 | 1.598908e+00 | -7.063794e-14 | 1.616346e-13 | -6.147513e-13 |
| 496 | 0.000000 | 0.000000 | 0.000000e+00 | 0.000000e+00 | 0.000000e+00 | 0.000000e+00 | 0.000000e+00 | 0.000000e+00 | 0.000000e+00 | 0.000000e+00 | ... | 0.000000e+00 | 0.000000e+00 | 0.000000e+00 | 0.000000e+00 | 0.000000e+00 | 1.598908e+00 | 9.354532e+00 | 2.784104e+00 | -2.967071e-14 | 6.891189e-15 |
| 497 | 0.000000 | 0.000000 | 0.000000e+00 | 0.000000e+00 | 0.000000e+00 | 0.000000e+00 | 0.000000e+00 | 0.000000e+00 | 0.000000e+00 | 0.000000e+00 | ... | 0.000000e+00 | 0.000000e+00 | 0.000000e+00 | 0.000000e+00 | 0.000000e+00 | 0.000000e+00 | 2.784104e+00 | 4.850156e+00 | 2.209458e+00 | -2.471287e-14 |
| 498 | 0.000000 | 0.000000 | 0.000000e+00 | 0.000000e+00 | 0.000000e+00 | 0.000000e+00 | 0.000000e+00 | 0.000000e+00 | 0.000000e+00 | 0.000000e+00 | ... | 0.000000e+00 | 0.000000e+00 | 0.000000e+00 | 0.000000e+00 | 0.000000e+00 | 0.000000e+00 | 0.000000e+00 | 2.209458e+00 | 3.758505e+00 | 8.705842e-01 |
| 499 | 0.000000 | 0.000000 | 0.000000e+00 | 0.000000e+00 | 0.000000e+00 | 0.000000e+00 | 0.000000e+00 | 0.000000e+00 | 0.000000e+00 | 0.000000e+00 | ... | 0.000000e+00 | 0.000000e+00 | 0.000000e+00 | 0.000000e+00 | 0.000000e+00 | 0.000000e+00 | 0.000000e+00 | 0.000000e+00 | 8.705842e-01 | 4.875842e-01 |
500 rows × 500 columns
df3=pd.DataFrame(Q.T@Q)
df3
|   | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | ... | 490 | 491 | 492 | 493 | 494 | 495 | 496 | 497 | 498 | 499 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1.000000e+00 | 4.417317e-16 | -9.378429e-16 | 1.404464e-15 | -1.886994e-15 | 2.376369e-15 | -2.946238e-15 | 3.626565e-15 | -4.492459e-15 | 5.637861e-15 | ... | -1.518865e-12 | 2.067147e-12 | -3.055939e-12 | 1.491440e-12 | -1.169352e-12 | 1.377457e-12 | -6.983069e-13 | 1.555071e-12 | -2.533678e-12 | 6.992244e-12 |
| 1 | 4.417317e-16 | 1.000000e+00 | 6.035801e-16 | -1.242219e-15 | 1.977972e-15 | -2.706646e-15 | 3.435008e-15 | -4.209888e-15 | 5.095720e-15 | -6.354062e-15 | ... | 4.165139e-11 | -5.681255e-11 | 8.406585e-11 | -4.109708e-11 | 3.252746e-11 | -3.849027e-11 | 2.022900e-11 | -4.586418e-11 | 7.518971e-11 | -2.082121e-10 |
| 2 | -9.378429e-16 | 6.035801e-16 | 1.000000e+00 | -1.590824e-16 | 6.466983e-17 | 2.698522e-17 | -1.612203e-16 | 2.654333e-16 | -3.366847e-16 | 3.584490e-16 | ... | 1.547963e-11 | -2.111405e-11 | 3.124247e-11 | -1.527342e-11 | 1.208782e-11 | -1.430322e-11 | 7.514921e-12 | -1.703560e-11 | 2.792678e-11 | -7.733182e-11 |
| 3 | 1.404464e-15 | -1.242219e-15 | -1.590824e-16 | 1.000000e+00 | -2.522436e-16 | 4.756396e-16 | -7.926661e-16 | 9.929708e-16 | -1.279208e-15 | 1.610800e-15 | ... | -7.699962e-12 | 1.050355e-11 | -1.554258e-11 | 7.598790e-12 | -6.016477e-12 | 7.120579e-12 | -3.747126e-12 | 8.500845e-12 | -1.393911e-11 | 3.860361e-11 |
| 4 | -1.886994e-15 | 1.977972e-15 | 6.466983e-17 | -2.522436e-16 | 1.000000e+00 | -3.640405e-16 | 9.233583e-16 | -1.406792e-15 | 1.996665e-15 | -2.631347e-15 | ... | 6.014372e-12 | -8.201944e-12 | 1.213547e-11 | -5.931665e-12 | 4.690940e-12 | -5.548778e-12 | 2.907381e-12 | -6.582079e-12 | 1.078504e-11 | -2.985700e-11 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 495 | 1.377457e-12 | -3.849027e-11 | -1.430322e-11 | 7.120579e-12 | -5.548778e-12 | 1.842047e-11 | -1.332360e-11 | 1.461547e-11 | 6.163669e-12 | 2.414930e-11 | ... | -2.119832e-15 | -1.291849e-14 | 5.824334e-16 | 1.917737e-15 | -1.951564e-16 | 1.000000e+00 | 1.028908e-15 | -2.966377e-15 | 5.259790e-15 | -1.520745e-14 |
| 496 | -6.983069e-13 | 2.022900e-11 | 7.514921e-12 | -3.747126e-12 | 2.907381e-12 | -9.666555e-12 | 7.008853e-12 | -7.678354e-12 | -3.218289e-12 | -1.267904e-11 | ... | 3.567242e-15 | 2.140161e-14 | -9.897140e-16 | -3.122014e-15 | 3.809127e-15 | 1.028908e-15 | 1.000000e+00 | 3.014082e-17 | -2.476928e-16 | 9.715536e-16 |
| 497 | 1.555071e-12 | -4.586418e-11 | -1.703560e-11 | 8.500845e-12 | -6.582079e-12 | 2.190075e-11 | -1.589786e-11 | 1.740546e-11 | 7.273056e-12 | 2.873226e-11 | ... | -1.071972e-14 | -6.430100e-14 | 3.018853e-15 | 9.306791e-15 | -1.274631e-14 | -2.966377e-15 | 3.014082e-17 | 1.000000e+00 | -3.827234e-17 | 1.111090e-15 |
| 498 | -2.533678e-12 | 7.518971e-11 | 2.792678e-11 | -1.393911e-11 | 1.078504e-11 | -3.589540e-11 | 2.606688e-11 | -2.853257e-11 | -1.191064e-11 | -4.709555e-11 | ... | 1.900422e-14 | 1.142412e-13 | -5.302209e-15 | -1.649814e-14 | 2.312609e-14 | 5.259790e-15 | -2.476928e-16 | -3.827234e-17 | 1.000000e+00 | 1.541031e-15 |
| 499 | 6.992244e-12 | -2.082121e-10 | -7.733182e-11 | 3.860361e-11 | -2.985700e-11 | 9.938682e-11 | -7.218897e-11 | 7.900797e-11 | 3.296219e-11 | 1.304024e-10 | ... | -5.474961e-14 | -3.299626e-13 | 1.517710e-14 | 4.771791e-14 | -6.757789e-14 | -1.520745e-14 | 9.715536e-16 | 1.111090e-15 | 1.541031e-15 | 1.000000e+00 |
500 rows × 500 columns
np.linalg.norm(Q.T @ A @ Q - H)/ np.linalg.norm(A)
6.801528514985873e-11
np.linalg.norm(Q.T @ Q - np.eye(n))
1.5239652052575796e-09
     One can observe a loss of numerical accuracy in the Arnoldi factorization itself, and a significant loss of orthogonality in the computed orthonormal basis of the Krylov subspace. This is the well-known instability of Gram-Schmidt in floating-point arithmetic: the orthogonality error grows roughly in proportion to the condition number of the vectors being orthogonalized, which is what motivates reorthogonalization.
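The effect is easy to reproduce outside of Arnoldi. The small experiment below (my own illustration, not part of the assignment; the Hilbert matrix is chosen only because it is famously ill-conditioned) orthonormalizes ill-conditioned columns with plain modified Gram-Schmidt and then with one extra reorthogonalization pass, in the spirit of DGKS ("twice is enough"):

```python
import numpy as np
from scipy.linalg import hilbert

def mgs(X, reorth=False):
    # Modified Gram-Schmidt QR of the columns of X, with an optional
    # second orthogonalization pass per column (DGKS-style).
    n, m = X.shape
    Q = np.zeros((n, m))
    for k in range(m):
        v = X[:, k].copy()
        for j in range(k):
            v -= (Q[:, j] @ v) * Q[:, j]
        if reorth:                     # second pass restores orthogonality
            for j in range(k):
                v -= (Q[:, j] @ v) * Q[:, j]
        Q[:, k] = v / np.linalg.norm(v)
    return Q

X = hilbert(10)                        # condition number ~ 1.6e13
Q1 = mgs(X)
Q2 = mgs(X, reorth=True)
err1 = np.linalg.norm(Q1.T @ Q1 - np.eye(10))   # plain MGS: orthogonality lost
err2 = np.linalg.norm(Q2.T @ Q2 - np.eye(10))   # reorthogonalized: ~ machine precision
print(err1, err2)
```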
def arnoldiDGKS(A,b,x0):
    # Arnoldi with DGKS reorthogonalization: after the classical Gram-Schmidt
    # projection, if the new vector lost too much of its norm, orthogonalize
    # it a second time against the current basis ("twice is enough").
    m=A.shape[0]; n=A.shape[1];
    H=np.zeros((m,n))
    Q=np.zeros((m,n));
    x0 = x0/np.linalg.norm(x0); Q[:,0] = x0;
    for j in range(n):
        w=A@Q[:,j]; nrmw=np.linalg.norm(w);
        H[0:j+1,j]=Q[:,0:j+1].T@w;        # project against q_0, ..., q_j inclusive
        w=w-Q[:,0:j+1]@H[0:j+1,j];
        if j+1 < n:                       # guard the subdiagonal index
            H[j+1,j]=np.linalg.norm(w);
            if H[j+1,j] < 0.717*nrmw:     # significant cancellation: reorthogonalize
                Htmp=Q[:,0:j+1].T@w;
                w=w-Q[:,0:j+1]@Htmp;
                H[0:j+1,j]=H[0:j+1,j]+Htmp; H[j+1,j]=np.linalg.norm(w);
            Q[:,j+1]=w/H[j+1,j]
    return H,Q;
n=5; m=5;
A=SPD(n,m);
b=np.random.randn(n);
x0=np.random.randn(m);
[H,Q]=arnoldiDGKS(A,b,x0)
df2=pd.DataFrame(H)
df2
df3=pd.DataFrame(Q.T@Q)
df3
np.linalg.norm(Q.T @ A @ Q - H)/ np.linalg.norm(A)
np.linalg.norm(Q.T @ Q - np.eye(n))
Coding.2 Implement the restarted FOM and GMRES methods; you can call the Arnoldi decomposition you have implemented.
For both FOM and GMRES, you should not use "\" to solve the projected problem. Instead you should code the Givens rotation and triangular solvers to solve the projected problems for both FOM and GMRES, and use the by-product for measuring convergence without computing the residual norms. (But for debugging/testing of your code, you can use "\" to solve the projected problem, just for comparison with your non-"\" implementations.)
Note that if you set the maximum subspace dimension to the matrix size $n$, then the code should perform as if no restart is used; otherwise, your code should be able to perform a restart with $m < n$.
     Solution:
     I spent the majority of my time trying to debug this implementation of GMRES, though I was quite unsuccessful. The implementation functions as expected in MATLAB, but an equivalent Python implementation produces a singular projected matrix or an out-of-bounds error. Simply put, this translation of GMRES is not suited for Python, and a different approach is needed to implement a functioning GMRES script in Python.
def rotmat(a,b):
    # Givens rotation coefficients (c, s) that zero the second component of (a, b)
    if b == 0.0:
        c=1.0;
        s=0.0;
    elif np.abs(b) > np.abs(a):
        temp=a/b;
        s=1.0/np.sqrt(1.0+temp**2);
        c=temp*s;
    else:
        temp=b/a;
        c=1.0/np.sqrt(1.0+temp**2);
        s=temp*c;
    return c,s;
def gmres(A,x,b,restrt,maxIter,tol):
    # Direct translation of a MATLAB restarted GMRES; kept as-is to document
    # the debugging attempt. Known trouble spots are flagged in comments.
    iter=0;
    flag=0;
    bnrm2=np.linalg.norm(b);
    if bnrm2 == 0.0:
        bnrm2=1;
    r=b-A@x;
    error=np.linalg.norm(r)/bnrm2;
    if error < tol:
        return x,error,iter,flag;
    m=A.shape[0]; n=A.shape[0];
    V=np.zeros((n,m+1));
    H=np.zeros((m+1,m));
    cs=np.zeros((m,1));
    sn=np.zeros((m,1));
    e1=np.zeros((n,1));
    e1[0]=1.0;
    for iter in range(maxIter):
        r=b-A@x;
        V[:,0]=r/np.linalg.norm(r);
        s=np.linalg.norm(r)*e1;
        for i in range(m-1):
            w=A@V[:,i];
            for k in range(i):        # NOTE: MATLAB's 1:i includes i; range(i) does not
                H[k,i]=w.T@V[:,k];
                w=w-H[k,i]*V[:,k];
            if i+1 < n:
                H[i+1,i]=np.linalg.norm(w);
                V[:,i+1]=w/H[i+1,i];
            for k in range(i):
                temp=cs[k]*H[k,i]+sn[k]*H[k+1,i];
                H[k+1,i]=-sn[k]*H[k,i]+cs[k]*H[k+1,i];
                H[k,i]=temp;
            [cs[i],sn[i]]=rotmat(H[i,i],H[i+1,i]);
            temp=cs[i]*s[i];
            s[i+1]=-sn[i]*s[i];
            s[i]=temp;
            H[i,i]=cs[i]*H[i,i]+sn[i]*H[i+1,i];
            H[i+1,i]=0.0;
            error=np.abs(s[i+1])/bnrm2;
            if error <= tol:
                y=np.linalg.solve(H,s);   # NOTE: H is rectangular here, but
                x=x+V@y;                  # np.linalg.solve requires a square system;
                break;                    # this is where the Python port breaks down
        if error <= tol:
            break;
        y=np.linalg.solve(H,s);
        x=x+V@y;
        r=b-A@x;
        s[i+1]=np.linalg.norm(r);
        error=s[i+1]/bnrm2;
        if error <= tol:
            break;
    if error < tol:
        flag=1
    return x,error,iter,flag;
n=100;m=100;
A=SPD(n,m);
b=np.random.randn(n);
x0=np.random.randn(m);
restrt=900;
maxIter=5000
tol=1e-10;
[x,error,iter,flag]=gmres(A,x0,b,restrt,maxIter,tol)
error
np.linalg.norm(b-A@x)
     After a significant time investment in the last implementation, I decided to derive GMRES by hand and then implement it in Python; the results of my toils are as follows.
def givensrotation(a,b):
    if b==0:
        c=1;
        s=0;
    else:
        if np.abs(b) > np.abs(a):
            r=a/b;
            s=1/np.sqrt(1+r**2);
            c=s*r;
        else:
            r=b/a;
            c=1/np.sqrt(1+r**2);
            s=c*r;
    return c,s;
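A quick sanity check of the rotation coefficients (my own addition, with the function body copied so the snippet is self-contained): applying $\begin{pmatrix} c & s \\ -s & c \end{pmatrix}$ to $(a,b)^{T}$ should leave $r=\sqrt{a^{2}+b^{2}}$ in the first component and zero out the second.

```python
import numpy as np

def givensrotation(a, b):            # copy of the routine defined above
    if b == 0:
        return 1.0, 0.0
    if np.abs(b) > np.abs(a):
        r = a / b
        s = 1 / np.sqrt(1 + r**2)
        c = s * r
    else:
        r = b / a
        c = 1 / np.sqrt(1 + r**2)
        s = c * r
    return c, s

c, s = givensrotation(3.0, 4.0)
print(c * 3.0 + s * 4.0)    # r = 5.0
print(-s * 3.0 + c * 4.0)   # eliminated entry: 0.0
```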
def gmres(A,x0,tol):
    iter=0; x=x0;
    bnrm2=np.linalg.norm(b);
    if bnrm2==0.0:
        bnrm2=1;
    r=b-A@x0;
    error=np.linalg.norm(r)/bnrm2;
    if error < tol:
        iter=1+iter;
        return x,error,iter;
    m=A.shape[0]; n=A.shape[1];
    H=np.zeros((m,n))
    Q=np.zeros((m,n));
    e1=np.zeros((n,1));
    e1[m-1]=1.0;
    S=np.zeros((m,1))
    S=np.linalg.norm(r)*e1;
    ritz_values = [];
    x0 = x0/np.linalg.norm(x0); Q[:,0] = x0;
    # Arnoldi factorization (modified Gram-Schmidt), as in Coding.1
    for k in range(n):
        u=A@Q[:,k];
        for j in range(k+1):
            qj=Q[:,j];
            H[j,k]=qj @ u;
            u=u-H[j,k]*qj;
        if k+1 < n:
            H[k+1,k]=np.linalg.norm(u);
            Q[:,k+1]=u/H[k+1,k];
    plt.spy(H);
    ritz_values.append(np.linalg.eig(H)[0]);
    df=pd.DataFrame(H)
    print(df)
    error1=np.zeros((2,1));
    I=np.eye(m);
    error1[0,0]=np.linalg.norm(Q.T@A@Q-H)/ np.linalg.norm(A);
    error1[1,0]=np.linalg.norm(Q.T@Q-np.eye(n));
    print("Factorization error = " + str(error1[0,0]) + " " + "Error of Q = " + str(error1[1,0]))
    # Givens sweep: eliminate the subdiagonal of H to obtain an upper
    # triangular matrix, accumulating the rotations in V
    V=np.eye(m);
    R=H;
    for j in range(n-1):
        for i in range(j+1,j,-1):
            G=np.eye(m);
            [c,s]=givensrotation(H[i-1,j],H[i,j]);
            G[i-1,i-1]=c; G[i-1,i]=-s;
            G[i,i-1]=s; G[i,i]=c;
            H=G.T@H;
            V=V@G
    return H,V;
n=5;m=5;
A=SPD(n,m);
b=np.random.randn(n);
x0=np.random.randn(m);
restrt=900;
maxIter=5000
tol=1e-10;
[H,V]=gmres(A,x0,tol)
|   | 0 | 1 | 2 | 3 | 4 |
|---|---|---|---|---|---|
| 0 | 3.014705 | 1.303124 | 6.661338e-16 | 2.220446e-16 | -9.228729e-16 |
| 1 | 1.303124 | 4.764382 | 4.602523e+00 | 4.440892e-16 | -2.595146e-15 |
| 2 | 0.000000 | 4.602523 | 8.681672e+00 | 3.860315e+00 | 5.440093e-15 |
| 3 | 0.000000 | 0.000000 | 3.860315e+00 | 5.178227e+00 | 9.347696e-01 |
| 4 | 0.000000 | 0.000000 | 0.000000e+00 | 9.347696e-01 | 9.009698e-01 |

Factorization error = 7.956301465264818e-16 Error of Q = 3.215827389821516e-15
df1=pd.DataFrame(H)
df1
| 0 | 1 | 2 | 3 | 4 | |
|---|---|---|---|---|---|
| 0 | 3.284292e+00 | 3.086544e+00 | 1.826164e+00 | 3.800216e-16 | -1.876808e-15 |
| 1 | -1.065805e-16 | 6.004492e+00 | 9.367859e+00 | 2.958982e+00 | 2.875200e-15 |
| 2 | 6.588413e-17 | 9.892409e-17 | 4.512763e+00 | 5.713627e+00 | 7.996220e-01 |
| 3 | -5.601009e-17 | -8.409835e-17 | -1.244649e-16 | 1.090297e+00 | 1.021654e+00 |
| 4 | 9.329269e-17 | 1.400777e-16 | 2.073138e-16 | 8.159767e-17 | 4.866962e-02 |
     I was able to attain the Hessenberg matrix, and to transform away its subdiagonal to form an upper triangular matrix. The last major challenge, which I was not able to resolve before running out of time, was properly constructing the beta vector used to solve for the solution $y$ of the upper triangular system. With a little more time, I would have been able to resolve this issue.
# Unfinished continuation (intended to live inside the gmres routine above):
# update the rotated right-hand side S, monitor the error, and solve for y
temp=c*S[i];
S[i-1]=-s*S[i];
S[i]=temp;
error=np.abs(S[i])/bnrm2;
if error < tol:
    y=np.linalg.solve(H,S);
    x=x+Q@y;
    break;
df=pd.DataFrame(H)
print(df)
error2=np.zeros((2,1));
error2[0,0]=np.linalg.norm(R-(V@H));
error2[1,0]=np.linalg.norm(I-(V.T@V));
print("Factorization error = " + str(error2[0,0]) + " " + "Error of Orthogonal matrix = " + str(error2[1,0]));
return x,error;
     Unfortunately, I ran out of time and was unable to write this last portion of the script to properly construct the beta vector and so attain the approximate solution $x$.
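For reference, the missing step can be sketched as follows. This is my own reconstruction (not the course's reference solution; the function name `gmres_solve` is mine): apply each Givens rotation to both the Hessenberg matrix and the right-hand side $g=\beta e_{1}$, then back-substitute $Ry=g$ and set $x=x_{0}+Q_{n}y$. The sketch runs a full (unrestarted) Arnoldi sweep and does not handle breakdown.

```python
import numpy as np

def gmres_solve(A, b, x0):
    # Minimal full GMRES sketch: Arnoldi + Givens rotations + back-substitution.
    n = A.shape[0]
    r = b - A @ x0
    beta = np.linalg.norm(r)
    Q = np.zeros((n, n + 1))
    H = np.zeros((n + 1, n))
    Q[:, 0] = r / beta
    for k in range(n):                       # modified Gram-Schmidt Arnoldi
        u = A @ Q[:, k]
        for j in range(k + 1):
            H[j, k] = Q[:, j] @ u
            u -= H[j, k] * Q[:, j]
        H[k + 1, k] = np.linalg.norm(u)
        if H[k + 1, k] > 1e-14:              # (no special breakdown handling)
            Q[:, k + 1] = u / H[k + 1, k]
    g = np.zeros(n + 1)
    g[0] = beta                              # the "beta vector" beta*e1
    for i in range(n):                       # rotate H and g together
        t = np.hypot(H[i, i], H[i + 1, i])
        c, s = H[i, i] / t, H[i + 1, i] / t
        H[i, i:], H[i + 1, i:] = (c * H[i, i:] + s * H[i + 1, i:],
                                  -s * H[i, i:] + c * H[i + 1, i:])
        g[i], g[i + 1] = c * g[i] + s * g[i + 1], -s * g[i] + c * g[i + 1]
    y = np.zeros(n)
    for i in range(n - 1, -1, -1):           # back-substitution on R y = g
        y[i] = (g[i] - H[i, i + 1:n] @ y[i + 1:]) / H[i, i]
    return x0 + Q[:, :n] @ y

# illustrative run on a well-conditioned SPD test matrix
rng = np.random.default_rng(2)
G = rng.standard_normal((40, 20))
A = G.T @ G
b = rng.standard_normal(20)
x = gmres_solve(A, b, np.zeros(20))
print(np.linalg.norm(b - A @ x))             # residual near machine precision
```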
     Due to running out of time, I was unable to implement FOM. Dr. Zhou did, however, provide an efficient MATLAB implementation of GMRES.
function [x, residual, iter, mvcnt] = GMRes(A, b, m, opts);
%
% non-preconditioned GMRes linear equation solver to solve A*x = b for x
%
% Input: A -- the matrix
% b -- the right hand side
% m -- the allowed max dimension with restart (maybe automatically adapted)
% opts -- options for the solver, including
% tol -- stopping error tolerance
% ITmax -- maximum iteration number
% x0 -- initial solution (if chosen)
% x_ext -- exact solution (if available)
%
% Output:
% x -- found solution within ITmax iterations
% residual -- vector of residual norms for the history of iteration
% iter -- total number of iterations used
% mvcnt -- number of matrix-vector products
%
%
% (-ykz, for spring semester 2022)
%
DEBUG = 0;
%DEBUG = 1; %uncomment this line to see debugging printouts
n = size(A,1); mvcnt = 0;
if (nargin >= 4),
if isfield(opts,'x0'), x0=opts.x0; end
if isfield(opts,'ITmax'), ITmax=opts.ITmax; end
if isfield(opts,'tol'), tol=opts.tol; end
if isfield(opts,'x_ext'), x_ext=opts.x_ext; end
end
% in case opts does not contain necessary fields, set them explicitly
if ~exist('x0'),
x0 = zeros(n,1); r0 = b;
else
r0 = b - A*x0; mvcnt = mvcnt +1;
end
if ~exist('ITmax'), ITmax=max(1000, 2*n); end
if ~exist('tol'), tol = 1e-10; end
if ~exist('print_step'), print_step=2; end
if exist('x_ext'), %double check if the input x_ext is accurate or not
err_ext = norm(A*x_ext -b);
if err_ext > 1e-11*norm(b),
fprintf('input exact solution not exact, err=%e, not used for verification\n', err_ext)
EXACT_sol = 0;
else
EXACT_sol = 1;
end
end
residual = [];
normb = norm(b);
beta = norm(r0);
for iter = 1: ITmax,
% call arnoldi to generate AV = VH + fe_m', or equivalently
% A * V(:,1:m) = V(:,1:m+1) * H(1:m+1,1:m)
%
[H, V] = arnoldi(A, m, r0);
mvcnt = mvcnt + m;
if DEBUG==1, %when DEBUG is 1, the following LS solve takes extra time, should avoid for real computation
ymd = beta* (H(1:m+1,1:m) \ eye(m+1,1)); % (use direct method only for testing and comparison)
xmd = x0 + V(:,1:m)*ymd;
end
%
% apply Givens: to save computation for speedup, it is important NOT to form or compute a full
% size-(m+1) unitary matrix Q.
% the multiplication of the Givens rotation should be applied only to the two rows/columns
% that need to be updated (the rest should not need any update. this avoids a huge amount of
% computations involving multiplication by 0's)
%
gv = eye(m+1,1); %gv is the vector to store Q*e_1 (where Q is the product of all Givens rotators,
%note: for faster computation, the Q matrix should never be formed explicitly)
% Vtmp = V; %using another V matrix should be strictly avoided in real computation
vcol = V(:,1); %in the loop, vcol will be repeatedly updated to store the last column of V*Q'
for i = 1 : m,
tmp = sqrt(H(i,i)^2 + H(i+1,i)^2);
c = H(i,i) / tmp;
s = H(i+1,i) / tmp;
htmp = H(i, i:m);
H(i, i:m) = c*htmp + s*H(i+1, i:m);
H(i+1, i:m) = -s'*htmp + c'*H(i+1, i:m); %the complex-conjugate is necessary if A is complex
gtmp = gv(i);
gv(i) = c*gtmp + s*gv(i+1);
gv(i+1) = -s'*gtmp + c'*gv(i+1);
%%need to update V(:,1:m+1) to get the last column for the residual vector,
%%but V(:,1:m) needs to be unchanged for constructing x; so a clumsy way is to use Vtmp=V
%%before this loop is entered, then update as the following to apply G' to Vtemp from the right:
% vtmp = Vtmp(:,i);
% Vtmp(:, i) = c'*vtmp + s'*Vtmp(:, i+1);
% Vtmp(:, i+1) = -s*vtmp + c*Vtmp(:, i+1);
%
%a better and much more efficient way is to use a temp vector (vcol) to store the most recent last column
%in the matrix produce V*Q' (since computing the residual vector only needs the last column in V*Q'. and
%the last column of V*Q' is the iterated update of the 'last' updated column of V*G' inside this loop.
%there is no need to compute V*Q' at all, not even the need to form Q)
%
vcol = -s*vcol + c*V(:, i+1); %only keeps the 'last' updated column of V*G' for each Givens rotation
end
%should use backward-solve for the upper-triangular system
ym = beta * triu_solve(triu(H(1:m,1:m)), gv(1:m));
x = x0 + V(:,1:m)*ym;
resid = beta * abs(gv(m+1)); %resid = the true residual ||b - A*x||_2, the latter need not be computed
residual = [residual; resid];
if DEBUG==1,
ydiff = norm(ymd - ym);
true_resid = norm(A*x - b);
fprintf('it=%i, ydiff=%e, resid=%e, true_resid=%e\n', iter, ydiff, resid, true_resid)
end
if resid < tol*normb,
break;
else %restart
x0 = x; %set the most recent solution as the next initial
%r0 = beta*gv(m+1)*Vtmp(:,m+1); %note: r0 should not come from forming the matrix Vtmp=V*Q',
r0 = beta*gv(m+1)*vcol; %r0 can be obtained from the last column of V*Q', already stored in vcol
betaold = beta;
beta = resid; %update beta to be the norm of the current r0: beta = ||r0|| = ||b-Ax||
if DEBUG==1, %debugging tests
fprintf('compare r0 with true residual: || r0 - (b-Ax) || = %e\n', norm(b- A*x - r0))
fprintf('compare beta with true |r0|: |r0| = %e, beta=%e\n', norm(r0), beta)
end
%the following is not in the standard restarted GMres, here we add this to introduce some adpativeness
%if the residual stagnates, then automatically increase m by 10 (can skip this step by commenting it out,
%in that case u may not to manually increase m by a re-run if the input m was too small)
if abs(betaold - beta) < normb*tol,
m = min(m+10, n);
if DEBUG==1, fprintf(' --> Restart seems to stagnate, increase m to %i\n',m); end
end
end
end
%=========================================================
function [x, err] = triu_solve(R, b)
%
% solve R x = b for R upper triangular
%
[m, n]=size(R);
x = zeros(n,1);
x(n) = b(n)/R(n,n);
for i = n-1 : -1 : 1
x(i) = ( b(i) - R(i, i+1:n)*x(i+1:n) )/R(i,i);
end
if (nargout > 1),
err = norm(R*x - b);
end
%=========================================================
function [H, V, Error] = arnoldi(A, k, v0)
%
% Usage: [H, V] = arnoldi(A, k, v0)
%
% Input: A -- a square n by n matrix
% k -- an integer specifying how many Arnoldi steps to take,
% (if k=n, a full Hessenberg decomposition is
% constructed; in case arnoldi breaks down, random
% vectors are used to continue the iteration.)
% v0 -- (optional) initial vector
%
% Output: H -- if k<n, the k+1 by k Hessenberg matrix
% else, the n by n Hessenberg matrix
% V -- if k<n, the n by k+1 orthogonal matrix,
% else, the n by n orthogonal matrix
%
%
% This script constructs a k-step Arnoldi decomposition
%
% A * V(:,1:k) = V(:,1:k+1) * H(1:k+1,1:k)
%
% using CGS with DGKS reorthogonalization.
%
n = size(A, 1); k = min(k, n);
V = zeros(n, k+1);
H = zeros(k+1, k);
if (nargin < 3), v0 = rand(n,1); end
nrmv= norm(v0);
if nrmv > 0;
V(:,1) = v0/nrmv;
else
v0 = rand(n,1);
V(:,1) = v0/norm(v0);
end
irand = 0; % count # of break downs (# of random vectors used)
for j = 1 : k
f = A * V(:,j);
h = V(:,1:j)'*f;
f = f - V(:,1:j)*h;
% DGKS reorthogonalization
c = V(:,1:j)'*f;
f = f - V(:,1:j)*c;
H(1:j,j) = h + c;
H(j+1,j) = norm(f);
if abs(H(j+1,j)) > 5e-16
V(:,j+1) = f / H(j+1,j);
else % an invariant subspace found
H(j+1,j) = 0;
if k < n,
break, % exit (even if j < k)
else
% continue with a new random vector as V(:,j+1),
% always do a reorthogonalization in this case
irand = irand + 1;
f = rand(n,1);
h = V(:,1:j)'*f;
f = f - V(:,1:j)*h;
h = V(:,1:j)'*f;
f = f - V(:,1:j)*h;
V(:,j+1) = f /norm(f);
end
end
end
if (nargout > 2),
Error = norm(A*V(:,1:j) - V(:,1:j+1)*H)
%normf = norm(f)
fprintf('Inside Arnoldi, number of random vectors used = %d \n', irand)
end
Coding.3 Implement the restarted FOM and GMRES methods, both using a preconditioner provided by MATLAB's $\text{ilu}$ command (check the $\text{ilu}$ documentation for usage; you can use the most common zero-fill $\text{ilu}$).
     Solution:
     Since I was not able to properly implement GMRES or FOM; I was not able to derive and implement the preconditioned version of these iterative schemes.
# empty code block - unfortunately I was not able to code the preconditioned versions of GMRES and FOM
# Though at a later date, I will update this Jupyter notebook to include them. (\(0w0)/)
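In the meantime, here is a sketch of how the preconditioner itself could be obtained in Python: scipy's `spilu` plays the role of MATLAB's `ilu` (the matrix, seed, and `drop_tol` below are illustrative assumptions, not values from the assignment). A preconditioned FOM/GMRES would call `ilu.solve(r)` wherever it needs $M^{-1}r$.

```python
import numpy as np
from scipy.sparse import random as sparse_random, diags
from scipy.sparse.linalg import spilu

# Build a sparse, diagonally dominant test matrix (illustrative choice).
rng = np.random.default_rng(3)
n = 200
S = sparse_random(n, n, density=0.05, random_state=3, format='csc')
A = (S + diags(np.full(n, 10.0))).tocsc()
b = rng.standard_normal(n)

# Incomplete LU factorization; ilu.solve(r) applies M^{-1} r.
ilu = spilu(A, drop_tol=1e-4)
r = b.copy()
z = ilu.solve(r)                  # the preconditioned residual M^{-1} r

# If M approximates A well, A z should be close to r.
rel = np.linalg.norm(A @ z - r) / np.linalg.norm(r)
print(rel)
```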
Coding.4 Implement the CG (conjugate gradient) and CR (conjugate residual) methods. For the preconditioned versions, use right preconditioning with the M-inner product, and split preconditioning. You may decide (on your own) which $M$ matrices to use; for example, you may use incomplete Cholesky decomposition (check the $\text{ichol}$ documentation for usage).
     Solution:
     Here is our implementation of the conjugate gradient method.
def conjgrad(A,b,x,tol):
    # Standard (unpreconditioned) conjugate gradient for SPD A
    r=b-A@x;
    p=r; i=0;
    while np.linalg.norm(r) >= tol*np.linalg.norm(b):
        z=A@p;
        gamma=r.T@r;
        alpha=gamma/(z.T@p);
        x=x+alpha*p;
        r=r-alpha*z;
        beta=(r.T@r)/gamma;
        p=r+beta*p;
        i=i+1;
        if (i % 5) ==0:
            error=np.linalg.norm(b-A@x);
            print("Iteration = " + str(i) + ", " + "Error = " + str(error));
    return x
n=300; m=300;
A=SPD(n,m);
b=np.random.randn(n);
x=np.random.randn(m);
tol=1e-10;
x=conjgrad(A,b,x,tol)
Iteration = 5, Error = 329.08661196858486
Iteration = 10, Error = 103.35205011499148
Iteration = 15, Error = 60.71132300184745
Iteration = 20, Error = 33.87950206609813
Iteration = 25, Error = 21.113747370360837
Iteration = 30, Error = 21.339664850656508
...
Iteration = 515, Error = 0.00315626392139457
Iteration = 520, Error = 6.617439722312629e-05
Iteration = 525, Error = 3.7197703559264957e-06
Iteration = 530, Error = 2.590056173379263e-07
Iteration = 535, Error = 7.82623121881804e-09
     Here is our implementation of the conjugate residual (CR) method.
def conjres(A,b,x,tol):
    # conjugate residual method for a symmetric matrix A
    r=b-A@x; p=r; Ar=A@r;
    Ap=Ar; gamma=r.T@Ar;
    i=0;
    while np.sqrt(np.abs(gamma)) >= tol*np.linalg.norm(b):
        alpha=gamma/(Ap.T@Ap);
        x=x+alpha*p;
        r=r-alpha*Ap;
        Ar=A@r;
        gamma0=gamma; gamma=r.T@Ar;
        beta=gamma/gamma0;
        p=r+beta*p;
        Ap=Ar+beta*Ap;
        i=i+1;
        if (i % 5)==0:
            error=np.linalg.norm(b-A@x);
            print("Iteration = " + str(i) + ", " + "Error = " + str(error));
    return x;
n=300; m=300;
A=SPD(n,m);
b=np.random.randn(n);
x=np.random.randn(m);
tol=1e-10;
x=conjres(A,b,x,tol)
Iteration = 5, Error = 246.61816163592079
Iteration = 50, Error = 3.0462070120665543
Iteration = 100, Error = 1.5322679538657473
...
Iteration = 300, Error = 0.6076927808633997
...
Iteration = 500, Error = 0.05769907333507009
Iteration = 520, Error = 0.00834794207329096
Iteration = 540, Error = 6.98415344662874e-08
Iteration = 560, Error = 3.6221201043820335e-09
Iteration = 580, Error = 3.673515500073521e-10
     The Python package SciPy is the predominant sparse linear algebra library available to Python users. Its major limitation is its use of package-specific data and structure types, a serious handicap that hinders interoperability with other code. In this case specifically, the incomplete factorization available in SciPy returned a result in a package-specific type that could not be converted, so the resulting preconditioner could not be used in further calculations, as NumPy does not support that type.
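For reference, the factorization SciPy exposes for this purpose is `scipy.sparse.linalg.spilu`, which returns a factorization object rather than an explicit matrix; its `.solve` method can be wrapped as a `LinearOperator` and handed to SciPy's own CG as a preconditioner. The sketch below uses an illustrative SPD test matrix of our own (not the `SPD()` helper defined earlier), so the sizes and values are assumptions:

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

# illustrative SPD test system (our own assumption)
n = 50
rng = np.random.default_rng(0)
B = rng.standard_normal((n, n))
A = sp.csc_matrix(B @ B.T + n * np.eye(n))
b = rng.standard_normal(n)

# drop-tolerance incomplete LU; spilu returns a factorization object whose
# .solve method applies M^{-1}, which we wrap as a LinearOperator
ilu = spla.spilu(A)
M = spla.LinearOperator((n, n), matvec=ilu.solve)
x, info = spla.cg(A, b, M=M)
```

This sidesteps the conversion problem by never materializing the preconditioner as a matrix at all.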
     As a consequence, I had to derive and write my own incomplete Cholesky decomposition to produce a preconditioner for the preconditioned versions of the CG and CR methods.
def ichol(M):
    # in-place zero-fill incomplete Cholesky; returns the lower-triangular factor
    n=M.shape[0];
    for k in range(n):
        M[k,k]=np.sqrt(np.abs(M[k,k]));
        for i in range((k+1),n):
            if M[i,k]!=0:
                M[i,k]=M[i,k]/M[k,k];
        for j in range((k+1),n):
            for i in range(j,n):
                if M[i,k]!=0:
                    M[i,j]=M[i,j]-M[i,k]*M[j,k];
    # zero out the strict upper triangle
    for i in range(n):
        for j in range((i+1),n):
            M[i,j]=0;
    return M;
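A quick sanity check, with a randomly generated SPD matrix of our own as an assumption: on a fully dense SPD matrix nothing is dropped, so zero-fill incomplete Cholesky reduces to the exact Cholesky factor and can be compared against `np.linalg.cholesky`:

```python
import numpy as np

def ichol(M):
    # same zero-fill incomplete Cholesky as above, in-place
    n = M.shape[0]
    for k in range(n):
        M[k, k] = np.sqrt(M[k, k])
        for i in range(k + 1, n):
            if M[i, k] != 0:
                M[i, k] /= M[k, k]
        for j in range(k + 1, n):
            for i in range(j, n):
                if M[i, k] != 0:
                    M[i, j] -= M[i, k] * M[j, k]
    return np.tril(M)

rng = np.random.default_rng(0)
B = rng.standard_normal((10, 10))
A = B @ B.T + 10 * np.eye(10)   # dense SPD test matrix (assumed)
L = ichol(A.copy())
# on a dense matrix no entries are dropped, so L is the exact Cholesky factor
assert np.allclose(L @ L.T, A)
```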
     We generated a small 10-by-10 matrix to test the incomplete Cholesky decomposition.
n=10; m=10;
A=SPD(n,m);
df3=pd.DataFrame(A)
df3
|   | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 20.785799 | -1.778591 | -2.070010 | 6.872917 | 2.918517 | 3.767847 | 4.300331 | -4.037706 | -7.738089 | 7.451273 |
| 1 | -1.778591 | 3.749765 | 2.301858 | -2.268078 | 2.983920 | -0.590407 | 1.283308 | -0.143865 | -2.339834 | -0.705777 |
| 2 | -2.070010 | 2.301858 | 6.663856 | 1.771233 | -0.232366 | 1.065979 | 1.017149 | 1.072569 | 2.890032 | -3.907837 |
| 3 | 6.872917 | -2.268078 | 1.771233 | 10.255555 | 4.725299 | 4.074437 | 2.309214 | 0.666974 | 6.183271 | 0.379173 |
| 4 | 2.918517 | 2.983920 | -0.232366 | 4.725299 | 17.507754 | 3.616163 | 3.724373 | 2.237717 | -1.621177 | -1.463696 |
| 5 | 3.767847 | -0.590407 | 1.065979 | 4.074437 | 3.616163 | 5.140977 | 0.784688 | -0.912350 | 0.079583 | -3.333271 |
| 6 | 4.300331 | 1.283308 | 1.017149 | 2.309214 | 3.724373 | 0.784688 | 2.802046 | -0.621201 | -1.266005 | 0.337485 |
| 7 | -4.037706 | -0.143865 | 1.072569 | 0.666974 | 2.237717 | -0.912350 | -0.621201 | 3.619154 | 2.706780 | -2.103359 |
| 8 | -7.738089 | -2.339834 | 2.890032 | 6.183271 | -1.621177 | 0.079583 | -1.266005 | 2.706780 | 16.016800 | -2.358960 |
| 9 | 7.451273 | -0.705777 | -3.907837 | 0.379173 | -1.463696 | -3.333271 | 0.337485 | -2.103359 | -2.358960 | 14.606818 |
M=ichol(A)
df4=pd.DataFrame(M)
df4
|   | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 4.559145 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.00000 |
| 1 | -0.390115 | 1.896728 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.00000 |
| 2 | -0.454035 | 1.120210 | 2.280973 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.00000 |
| 3 | 1.507501 | -0.885725 | 1.511587 | 2.216662 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.00000 |
| 4 | 0.640146 | 1.704858 | -0.811722 | 2.931119 | 2.222853 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.00000 |
| 5 | 0.826437 | -0.141297 | 0.701233 | 0.741411 | 0.775605 | 1.671835 | 0.000000 | 0.000000 | 0.000000 | 0.00000 |
| 6 | 0.943232 | 0.870593 | 0.206124 | 0.607590 | 0.010224 | -0.283979 | 0.813650 | 0.000000 | 0.000000 | 0.00000 |
| 7 | -0.885628 | -0.258003 | 0.420645 | 0.513248 | 0.936437 | -0.968213 | -0.300263 | 0.650674 | 0.000000 | 0.00000 |
| 8 | -1.697268 | -1.582707 | 1.706454 | 2.147647 | -1.235453 | -0.342169 | -0.034859 | -0.322178 | 1.165465 | 0.00000 |
| 9 | 1.634358 | -0.035951 | -1.370252 | -0.020396 | -1.575054 | -1.490241 | -1.579371 | -0.799944 | -0.024402 | 1.48992 |
     Here is the left-preconditioned version of CG.
def PCGleft(A,b,x,tol):
    r=b-A@x;
    M=ichol(A.copy());  # copy so ichol does not overwrite A in place
    # note: M here is only the lower-triangular factor L, so solving M z = r
    # applies just half of the preconditioner L@L.T
    z=np.linalg.solve(M,r);
    p=z; i=0;
    while np.linalg.norm(r) >= tol*np.linalg.norm(b):
        Ap=A@p;
        alpha=(r.T@z)/(Ap.T@p);
        x=x+alpha*p;
        r_temp=r;
        z_temp=z;
        r=r-(alpha*Ap);
        z=np.linalg.solve(M,r);
        beta=(r.T@z)/(r_temp.T@z_temp);
        p=z+beta*p;
        i=i+1;
        if (i % 1)==0:
            error=np.linalg.norm(b-A@x);
            print("Iteration = " + str(i) + ", " + "Error = " + str(error));
    return x;
n=100; m=100;
A=SPD(n,m);
b=np.random.randn(n);
x=np.random.randn(m);
tol=1e-10;
x=PCGleft(A,b,x,tol)
Iteration = 1, Error = 1069.7565098543462
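The stall above may be related to applying only the factor L rather than the full preconditioner. As a sketch (not the author's code), a left-preconditioned CG that applies the complete M = L@Lᵀ via two triangular solves looks like the following; the `pcg` helper and the SPD test matrix are our own illustrative assumptions, and the exact Cholesky factor stands in for the `ichol` factor:

```python
import numpy as np

def pcg(A, L, b, x, tol=1e-10, maxit=1000):
    # left-preconditioned CG; M = L @ L.T is applied via two triangular solves
    r = b - A @ x
    z = np.linalg.solve(L.T, np.linalg.solve(L, r))
    p = z.copy()
    rz = r @ z
    for _ in range(maxit):
        Ap = A @ p
        alpha = rz / (p @ Ap)
        x = x + alpha * p
        r = r - alpha * Ap
        if np.linalg.norm(r) < tol * np.linalg.norm(b):
            break
        z = np.linalg.solve(L.T, np.linalg.solve(L, r))
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x

rng = np.random.default_rng(0)
n = 100
B = rng.standard_normal((n, n))
A = B @ B.T + n * np.eye(n)   # assumed SPD test matrix
L = np.linalg.cholesky(A)     # stand-in for the ichol factor
b = rng.standard_normal(n)
x = pcg(A, L, b, np.zeros(n))
```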
     Here is the split-preconditioned version of CG.
def PCGsplit(A,b,x0,tol):
    r=b-A@x0; maxit=10000;
    M=ichol(A.copy());  # copy so ichol does not overwrite A in place
    z=np.linalg.solve(M,r);
    gamma=(z.T@r)/(z.T@A@z);
    x=gamma*z+x0; w=1; i=0;
    xTemp0=x0;
    while np.linalg.norm(r) >= tol*np.linalg.norm(b):
        r=b-A@x;
        zTemp=z; xTemp1=x;
        z=np.linalg.solve(M,r);
        gammaTemp=gamma;
        gamma=(z.T@r)/(z.T@A@z);
        w=(1-((gamma/gammaTemp)*((z.T@r)/(zTemp.T@M@zTemp))*(1/w)));
        v=gamma*z+xTemp1-xTemp0;
        u=w*v;
        x=xTemp0+u;
        xTemp0=xTemp1;
        xTemp1=x;
        i=i+1;
        if (i % 1)==0:
            error=np.linalg.norm(b-A@x);
            print("Iteration = " + str(i) + ", " + "Error = " + str(error));
        if i == maxit:
            break;
    return x;
n=300; m=300;
A=SPD(n,m);
b=np.random.randn(n);
x=np.random.randn(m);
tol=1e-10;
x=PCGsplit(A,b,x,tol)
Iteration = 1, Error = 298.3361529207821
...
Iteration = 5, Error = 851.1707397473474
...
Iteration = 10, Error = 0.8593802455085187
Iteration = 15, Error = 0.00036793485293862423
Iteration = 20, Error = 7.196314063490302e-07
Iteration = 25, Error = 3.05704617716336e-08
Iteration = 30, Error = 9.061729961677658e-07
Iteration = 35, Error = 6.161614422285384e-11
Coding.5 Implement CGS (the conjugate gradient squared method) and BICGSTAB (the biconjugate gradient stabilized method), both without preconditioning; following the unpreconditioned implementations, implement the preconditioned versions: note, preconditioned by zero-fill $\text{ilu}$.
     Solution:
     Here is an unstable implementation of CGS.
def cgs(A,b,x,tol):
    flag=0; maxit=10000;
    bnrm2=np.linalg.norm(b);
    if bnrm2==0.0:
        bnrm2=1;
    r=b-A@x;
    error=np.linalg.norm(r)/bnrm2;
    if error < tol:
        return x;
    r_tld=r;
    for iter in range(maxit):
        rho=(r_tld.T@r);
        if rho == 0.0:
            print("error: rho became zero, so the algorithm was stopped.")
            break;
        if iter > 0:
            beta=rho/rho_1;
            u=r+beta*q;
            p=r+beta*(q+beta*p);
        else:
            u=r;
            p=u;
        p_hat=p;
        v_hat=A@p_hat;
        alpha=rho/(r_tld.T@v_hat);
        q=u-(alpha*v_hat);
        u_hat=(u+q);
        x=x+(alpha*u_hat);
        r=r-(alpha*A@u_hat);
        error=np.linalg.norm(r)/bnrm2;
        if error < tol:
            break;
        rho_1=rho;
        if (iter % 10)==0:
            error=np.linalg.norm(b-A@x);
            print("Iteration = " + str(iter) + ", " + "Error = " + str(error));
    if error < tol:
        flag=1;
    elif rho==0.0:
        flag=-1;
    else:
        flag=0;
    return x;
n=10; m=10;
A=SPD(n,m);
b=np.random.randn(n);
x=np.random.randn(m);
tol=1e-10;
x=cgs(A,b,x,tol)
Iteration = 0, Error = 10.53494558731843 Iteration = 10, Error = 1.7256271557634575 Iteration = 20, Error = 1.1571307442229524 Iteration = 30, Error = 2.806251621610245 Iteration = 40, Error = 1.9752424262087536 Iteration = 50, Error = 0.5632942615071121 Iteration = 60, Error = 0.0919493215015062 Iteration = 70, Error = 0.05923156291597769 Iteration = 80, Error = 0.059228578366798616 Iteration = 90, Error = 0.05922857294303022 Iteration = 100, Error = 0.0592285729547866 Iteration = 110, Error = 0.059228572954784106 error: rho became zero, so the algorithm was stopped.
x=cgs(A,b,x,tol)
Iteration = 0, Error = 3.432853910643634 Iteration = 10, Error = 1.6840695671647112 Iteration = 20, Error = 0.6830353535660242 Iteration = 30, Error = 1.0260904218716613 Iteration = 40, Error = 1.469067596100978 Iteration = 50, Error = 1.340208709268638 Iteration = 60, Error = 0.9616990217355383 Iteration = 70, Error = 0.6741012717097074 Iteration = 80, Error = 0.4659188857745712 Iteration = 90, Error = 0.23859493372831825 Iteration = 100, Error = 0.04345457009945859 Iteration = 110, Error = 0.007825236985434384 Iteration = 120, Error = 0.007817271636537481 Iteration = 130, Error = 0.007817282194387173 Iteration = 140, Error = 0.007817282199430684 Iteration = 150, Error = 0.007817282199430986 error: rho became zero, so the algorithm was stopped.
x=cgs(A,b,x,tol)
Iteration = 0, Error = 0.001498347896481122 Iteration = 10, Error = 0.004526851341240384 Iteration = 20, Error = 0.00715552334929545 Iteration = 30, Error = 0.0051169262051169425 Iteration = 40, Error = 0.0031531279502610963 Iteration = 50, Error = 0.0015546659773898494 Iteration = 60, Error = 0.000474116538656269 Iteration = 70, Error = 0.00022650372809981983 Iteration = 80, Error = 0.00022645213347779593 Iteration = 90, Error = 0.00022645213603078845 Iteration = 100, Error = 0.00022645213602885664 Iteration = 110, Error = 0.00022645213602885664 error: rho became zero, so the algorithm was stopped.
x=cgs(A,b,x,tol)
Iteration = 0, Error = 0.013154379694558951 Iteration = 10, Error = 0.011873537825198401 Iteration = 20, Error = 0.007486715327086097 Iteration = 30, Error = 0.003321499354919438 Iteration = 40, Error = 0.0027402129346744044 Iteration = 50, Error = 0.001415301576050847 Iteration = 60, Error = 0.00024487302897238086 Iteration = 70, Error = 3.936952223735394e-05 Iteration = 80, Error = 3.9162152879845274e-05 Iteration = 90, Error = 3.91623235125356e-05 Iteration = 100, Error = 3.916232365415987e-05 error: rho became zero, so the algorithm was stopped.
x=cgs(A,b,x,tol)
Iteration = 0, Error = 7.6163377620116575e-06 Iteration = 10, Error = 2.0878106170527924e-05 Iteration = 20, Error = 2.0560523301000405e-05 Iteration = 30, Error = 1.4759820547850368e-05 Iteration = 40, Error = 1.0020957190542835e-05 Iteration = 50, Error = 5.339935125692039e-06 Iteration = 60, Error = 2.1469075741725418e-06 Iteration = 70, Error = 7.471636801698677e-07 Iteration = 80, Error = 7.268370041023725e-07 Iteration = 90, Error = 7.268358034818111e-07 Iteration = 100, Error = 7.268358038382663e-07 Iteration = 110, Error = 7.268358038382663e-07 error: rho became zero, so the algorithm was stopped.
     Here is a stable implementation of CGS.
def conjgs(A,b,x,tol):
    i=0;
    r=b-A@x; r_tld=r;
    u=r; p=u;
    while np.linalg.norm(r) > tol:
        Ap=A@p;
        alpha=(r.T@r_tld)/(Ap.T@r_tld);
        q=u-alpha*Ap;
        x=x+(alpha*(u+q));
        r_temp=r;
        r=r-alpha*A@(u+q);
        if (i % 10)==0:
            error=np.linalg.norm(b-A@x);
            print("Iteration = " + str(i) + ", " + "Error = " + str(error));
        if np.linalg.norm(r) < tol:
            break;
        beta=(r.T@r_tld)/(r_temp.T@r_tld);
        u=r+beta*q;
        p=u+beta*(q+beta*p);
        i=i+1;
    return x;
n=500; m=500;
A=SPD(n,m);
b=np.random.randn(n);
x=np.random.randn(m);
tol=1e-10;
x=conjgs(A,b,x,tol)
Iteration = 0, Error = 2871.8533561406994
Iteration = 100, Error = 7.431167764037988
...
Iteration = 510, Error = 4374.639848521565
...
Iteration = 830, Error = 349162.13597954356
...
Iteration = 1010, Error = 0.3706205564623151
Iteration = 1500, Error = 3.926302299692695e-08
...
Iteration = 2000, Error = 0.004920460224620118
...
Iteration = 2250, Error = 1.8578089641745266e-08
     Here is the preconditioned stable implementation of CGS.
def Pconjgs(A,b,x,tol):
    i=0;
    M=ichol(A.copy());  # copy so ichol does not overwrite A in place
    r=b-A@x; r_tld=r;
    u=r; p=u;
    error=np.linalg.norm(b-A@x);
    print("Iteration = " + str(i) + ", " + "Error = " + str(error));
    while np.linalg.norm(r) > tol:
        p=np.linalg.solve(M,p);
        Ap=A@p;
        alpha=(r.T@r_tld)/(Ap.T@r_tld);
        q=u-alpha*Ap;
        # note: this solves with A itself, i.e. an exact solve rather than an
        # application of the preconditioner M, which is why the run below
        # converges in a single step
        u_hat=np.linalg.solve(A,(u+q));
        x=x+(alpha*u_hat);
        r_temp=r;
        r=r-alpha*A@u_hat;
        if (i % 1)==0:
            error=np.linalg.norm(b-A@x);
            print("Iteration = " + str(i+1) + ", " + "Error = " + str(error));
        if np.linalg.norm(r) < tol:
            break;
        beta=(r.T@r_tld)/(r_temp.T@r_tld);
        u=r+beta*q;
        p=u+beta*(q+beta*p);
        i=i+1;
    return x;
n=500; m=500;
A=SPD(n,m);
b=np.random.randn(n);
x=np.random.randn(m);
tol=1e-10;
x=Pconjgs(A,b,x,tol)
Iteration = 0, Error = 535.9690538939683 Iteration = 1, Error = 3.989678650926232e-13
     Here is the biconjugate gradient stabilized (BiCGSTAB) method.
def BICGStab(A,b,x,tol):
    i=0;
    r=b-A@x; r_tld=r;
    p=r;
    error=np.linalg.norm(r);
    print("Iteration = " + str(i) + ", " + "Error = " + str(error));
    while np.linalg.norm(r) > tol:
        Ap=A@p;
        rho=r.T@r_tld;
        alpha=rho/(Ap.T@r_tld);
        s=r-alpha*Ap;
        As=A@s;
        w=(As.T@s)/(As.T@As);
        x=x+alpha*p+w*s;
        r=s-w*As;
        if (i % 100)==0:
            error=np.linalg.norm(r);
            print("Iteration = " + str(i) + ", " + "Error = " + str(error));
        if np.linalg.norm(r) < tol:
            break;
        # beta uses the ratio of the new and old rho = r.T @ r_tld
        beta=(alpha/w)*((r.T@r_tld)/rho);
        p=r+(beta*(p-w*Ap));
        i=i+1;
    return x;
n=10; m=10;
A=SPD(n,m);
b=np.random.randn(m);
x=np.random.randn(m);
tol=1e-10;
x=BICGStab(A,b,x,tol)
Iteration = 0, Error = 78.24071092333004
Iteration = 300, Error = 0.47475361274814226
...
Iteration = 900, Error = 13356.32869331797
...
Iteration = 2000, Error = 0.11091789979900095
...
Iteration = 3100, Error = 7.796994909237536e-06
Iteration = 3300, Error = 3.991373317992016e-09
Iteration = 3600, Error = 1.325082220185301e-10
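For a sanity check against a reference implementation, SciPy ships its own BiCGSTAB as `scipy.sparse.linalg.bicgstab`. The sketch below uses an illustrative SPD test matrix of our own (an assumption, not the `SPD()` helper above):

```python
import numpy as np
import scipy.sparse.linalg as spla

rng = np.random.default_rng(1)
n = 10
B = rng.standard_normal((n, n))
A = B @ B.T + n * np.eye(n)   # assumed SPD test matrix
b = rng.standard_normal(n)

# dense arrays are accepted; info == 0 signals convergence
x, info = spla.bicgstab(A, b)
```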
     Here is the preconditioned biconjugate gradient stabilized method.
def PBICGStab(A,b,x,tol):
    i=0;
    M=ichol(A.copy());  # copy so ichol does not overwrite A in place
    r=b-A@x; r_tld=r;
    p=r;
    error=np.linalg.norm(r);
    print("Iteration = " + str(i) + ", " + "Error = " + str(error));
    while np.linalg.norm(r) > tol:
        p=np.linalg.solve(M,p);
        Ap=A@p;
        rho=r.T@r_tld;
        alpha=rho/(Ap.T@r_tld);
        s=r-alpha*Ap;
        s=np.linalg.solve(M,s);
        As=A@s;
        w=(As.T@s)/(As.T@As);
        x=x+alpha*p+w*s;
        r=s-w*As;
        if (i % 1)==0:
            error=np.linalg.norm(r);
            print("Iteration = " + str(i+1) + ", " + "Error = " + str(error));
        if np.linalg.norm(r) < tol:
            break;
        # beta uses the ratio of the new and old rho = r.T @ r_tld
        beta=(alpha/w)*((r.T@r_tld)/rho);
        p=r+(beta*(p-w*Ap));
        i=i+1;
    return x;
n=500; m=500;
A=SPD(n,m);
b=np.random.randn(n);
x=np.random.randn(m);
tol=1e-10;
x=PBICGStab(A,b,x,tol)
Iteration = 0, Error = 520.820626322756 Iteration = 1, Error = 5.163480703387725e-13
     And lastly, an art piece of Kromia!